35 research outputs found

    An Accelerated Decentralized Stochastic Proximal Algorithm for Finite Sums

    Get PDF
    Modern large-scale finite-sum optimization relies on two key aspects: distribution and stochastic updates. For smooth and strongly convex problems, existing decentralized algorithms are slower than modern accelerated variance-reduced stochastic algorithms when run on a single machine, and are therefore not efficient. Centralized algorithms are fast, but their scaling is limited by global aggregation steps that result in communication bottlenecks. In this work, we propose an efficient \textbf{A}ccelerated \textbf{D}ecentralized stochastic algorithm for \textbf{F}inite \textbf{S}ums named ADFS, which uses local stochastic proximal updates and randomized pairwise communications between nodes. On nn machines, ADFS learns from nmnm samples in the same time it takes optimal algorithms to learn from mm samples on one machine. This scaling holds until a critical network size is reached, which depends on communication delays, on the number of samples mm, and on the network topology. We provide a theoretical analysis based on a novel augmented graph approach combined with a precise evaluation of synchronization times and an extension of the accelerated proximal coordinate gradient algorithm to arbitrary sampling. We illustrate the improvement of ADFS over state-of-the-art decentralized approaches with experiments.Comment: Code available in source files. arXiv admin note: substantial text overlap with arXiv:1901.0986

    Who Started This Rumor? Quantifying the Natural Differential Privacy of Gossip Protocols

    Get PDF
    Gossip protocols (also called rumor spreading or epidemic protocols) are widely used to disseminate information in massive peer-to-peer networks. These protocols are often claimed to guarantee privacy because of the uncertainty they introduce on the node that started the dissemination. But is that claim really true? Can the source of a gossip safely hide in the crowd? This paper examines, for the first time, gossip protocols through a rigorous mathematical framework based on differential privacy to determine the extent to which the source of a gossip can be traceable. Considering the case of a complete graph in which a subset of the nodes are curious, we study a family of gossip protocols parameterized by a "muting" parameter s: nodes stop emitting after each communication with a fixed probability 1-s. We first prove that the standard push protocol, corresponding to the case s = 1, does not satisfy differential privacy for large graphs. In contrast, the protocol with s = 0 (nodes forward only once) achieves optimal privacy guarantees but at the cost of a drastic increase in the spreading time compared to standard push, revealing an interesting tension between privacy and spreading time. Yet, surprisingly, we show that some choices of the muting parameter s lead to protocols that achieve an optimal order of magnitude in both privacy and speed. Privacy guarantees are obtained by showing that only a small fraction of the possible observations by curious nodes have different probabilities when two different nodes start the gossip, since the source node rapidly stops emitting when s is small. The speed is established by analyzing the mean dynamics of the protocol, and leveraging concentration inequalities to bound the deviations from this mean behavior. We also confirm empirically that, with appropriate choices of s, we indeed obtain protocols that are very robust against concrete source location attacks (such as maximum a posteriori estimates) while spreading the information almost as fast as the standard (and non-private) push protocol

    Toward a route detection method base on detail call records

    Get PDF
    In the last years, smartphones have become the major device for communication enabling Telco operators to capture subscribers’ whereabouts. This location information allows computing eostatistics to study transportation systems, traffic jams, origin-destination matrix, etc. The first task to accomplish the aforementioned objectives is to detect routes that people use to go from A to B. Thus, in the present effort, we propose a method to extract automatically routes from CDR data relying on clustering and community detection algorithms

    Who started this rumor? Quantifying the natural differential privacy guarantees of gossip protocols

    Get PDF
    Gossip protocols are widely used to disseminate information in massive peer-to-peer networks. These protocols are often claimed to guarantee privacy because of the uncertainty they introduce on the node that started the dissemination. But is that claim really true? Can the source of a gossip safely hide in the crowd? This paper examines, for the first time, gossip protocols through a rigorous mathematical framework based on differential privacy to determine the extent to which the source of a gossip can be traceable. Considering the case of a complete graph in which a subset of the nodes are curious, we study a family of gossip protocols parameterized by a ``muting'' parameter ss: nodes stop emitting after each communication with a fixed probability 1s1-s. We first prove that the standard push protocol, corresponding to the case s=1s=1, does not satisfy differential privacy for large graphs. In contrast, the protocol with s=0s=0 achieves optimal privacy guarantees but at the cost of a drastic increase in the spreading time compared to standard push, revealing an interesting tension between privacy and spreading time. Yet, surprisingly, we show that some choices of the muting parameter ss lead to protocols that achieve an optimal order of magnitude in both privacy and speed. We also confirm empirically that, with appropriate choices of ss, we indeed obtain protocols that are very robust against concrete source location attacks while spreading the information almost as fast as the standard (and non-private) push protocol

    Asynchrony and Acceleration in Gossip Algorithms

    Get PDF
    This paper considers the minimization of a sum of smooth and strongly convex functions dispatched over the nodes of a communication network. Previous works on the subject either focus on synchronous algorithms, which can be heavily slowed down by a few slow nodes (the straggler problem), or consider a model of asynchronous operation (Boyd et al., 2006) in which adjacent nodes communicate at the instants of Poisson point processes. We have two main contributions. 1) We propose CACDM (a Continuously Accelerated Coordinate Dual Method), and for the Poisson model of asynchronous operation, we prove CACDM to converge to optimality at an accelerated convergence rate in the sense of Nesterov et Stich, 2017. In contrast, previously proposed asynchronous algorithms have not been proven to achieve such accelerated rate. While CACDM is based on discrete updates, the proof of its convergence crucially depends on a continuous time analysis. 2) We introduce a new communication scheme based on Loss-Networks, that is programmable in a fully asynchronous and decentralized way, unlike the Poisson model of asynchronous operation that does not capture essential aspects of asynchrony such as non-instantaneous communications and computations. Under this Loss-Network model of asynchrony, we establish for CDM (a Coordinate Dual Method) a rate of convergence in terms of the eigengap of the Laplacian of the graph weighted by local effective delays. We believe this eigengap to be a fundamental bottleneck for convergence rates of asynchronous optimization. Finally, we verify empirically that CACDM enjoys an accelerated convergence rate in the Loss-Network model of asynchrony

    Dual-Free Stochastic Decentralized Optimization with Variance Reduction

    Get PDF
    We consider the problem of training machine learning models on distributed data in a decentralized way. For finite-sum problems, fast single-machine algorithms for large datasets rely on stochastic updates combined with variance reduction. Yet, existing decentralized stochastic algorithms either do not obtain the full speedup allowed by stochastic updates, or require oracles that are more expensive than regular gradients. In this work, we introduce a Decentralized stochastic algorithm with Variance Reduction called DVR. DVR only requires computing stochastic gradients of the local functions, and is computationally as fast as a standard stochastic variance-reduced algorithms run on a 1/n1/n fraction of the dataset, where nn is the number of nodes. To derive DVR, we use Bregman coordinate descent on a well-chosen dual problem, and obtain a dual-free algorithm using a specific Bregman divergence. We give an accelerated version of DVR based on the Catalyst framework, and illustrate its effectiveness with simulations on real data

    An Accelerated Decentralized Stochastic Proximal Algorithm for Finite Sums

    Get PDF
    Modern large-scale finite-sum optimization relies on two key aspects: distribution and stochastic updates. For smooth and strongly convex problems, existing decentralized algorithms are slower than modern accelerated variance-reduced stochastic algorithms when run on a single machine, and are therefore not efficient. Centralized algorithms are fast, but their scaling is limited by global aggregation steps that result in communication bottlenecks. In this work, we propose an efficient Accelerated Decentralized stochastic algorithm for Finite Sums named ADFS, which uses local stochastic proximal updates and randomized pairwise communications between nodes. On n machines, ADFS learns from nm samples in the same time it takes optimal algorithms to learn from m samples on one machine. This scaling holds until a critical network size is reached, which depends on communication delays, on the number of samples m, and on the network topology. We provide a theoretical analysis based on a novel augmented graph approach combined with a precise evaluation of synchronization times and an extension of the accelerated proximal coordinate gradient algorithm to arbitrary sampling. We illustrate the improvement of ADFS over state-of-the-art decentralized approaches with experiments

    Who started this rumor? Quantifying the natural differential privacy guarantees of gossip protocols

    Get PDF
    International audienceGossip protocols are widely used to disseminate information in massive peer-to-peer networks. These protocols are often claimed to guarantee privacy because of the uncertainty they introduce on the node that started the dissemination. But is that claim really true? Can the source of a gossip safely hide in the crowd? This paper examines, for the first time, gossip protocols through a rigorous mathematical framework based on differential privacy to determine the extent to which the source of a gossip can be traceable. Considering the case of a complete graph in which a subset of the nodes are curious, we study a family of gossip protocols parameterized by a ``muting'' parameter s: nodes stop emitting after each communication with a fixed probability 1-s. We first prove that the standard push protocol, corresponding to the case s=1, does not satisfy differential privacy for large graphs. In contrast, the protocol with s=0 achieves optimal privacy guarantees but at the cost of a drastic increase in the spreading time compared to standard push, revealing an interesting tension between privacy and spreading time. Yet, surprisingly, we show that some choices of the muting parameter s lead to protocols that achieve an optimal order of magnitude in both privacy and speed. We also confirm empirically that, with appropriate choices of s, we indeed obtain protocols that are very robust against concrete source location attacks while spreading the information almost as fast as the standard (and non-private) push protocol
    corecore